
voice input portion for outputting a [vocal sample] user's 
digitally converted [from voice made] vocal sample, a 
preprocessor for extracting a characteristic vector suitable for 
phoneme division [,] from the vocal sample input [from] at the 
voice input portion, a multi-layer perceptron (MLP) phoneme 
dividing portion for finding and outputting the phoneme border 
[of phoneme,] using the characteristic vector of the 
preprocessor, and a phoneme border outputting portion for 
outputting position information [on] regarding the phoneme border 
[of phoneme] of the MLP phoneme dividing portion in the form of 
frame position, said method comprising the steps of: 

(a) sequentially segmenting and framing a voice with 
digitalized voice samples, extracting characteristic vectors by 
vocal frames, and extracting an inter-frame characteristic vector 
of the difference between nearby frames of the characteristic 
[vectors] vector by frames, to thereby normalize the maximum and 
minimum of said [characteristics] inter-frame characteristic 
vectors; 

(b) initializing weights present between an input layer 
and a hidden layer and between the hidden layer and an output 
layer of said MLP, designating an output target data of said MLP, 
inputting said inter-frame characteristic vectors extracted from 
a currently analyzed frame to said MLP for learning, and storing 
and finishing information on the weight obtained through learning 
and the standard of said MLP if the reduction rate of a mean 
squared error converges within a permissible limit; and 

(c) reading the weight obtained in said step (b), 
receiving said inter-frame characteristic vectors from the 
currently analyzed frame, performing an operation of phoneme 
border discrimination to generate [an] output [value] values, 
discriminating the phoneme border according to the output [value] 
values, and if the current analyzed frame arrives two frames 
preceding the final frame of [incoming voice] the end portion of 
the vocal samples, outputting a frame number indicative of the 
border of phoneme as a final result. 
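
For illustration only, the learning loop of step (b) might be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the sigmoid activation, the per-sample weight update, the hidden-layer size, the learning rate, and the reading of "reduction rate of a mean squared error" as the relative change in error between passes are all assumptions, and the names train, forward, and mean_squared_error are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_ih, w_ho):
    """Return (hidden, output) activations; the last weight in each row is a bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_ih]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_ho]
    return hidden, output

def mean_squared_error(inputs, targets, w_ih, w_ho):
    total, count = 0.0, 0
    for x, t in zip(inputs, targets):
        _, out = forward(x, w_ih, w_ho)
        total += sum((tk - ok) ** 2 for tk, ok in zip(t, out))
        count += len(t)
    return total / count

def train(inputs, targets, n_hidden=4, lr=0.5, tol=1e-4, max_epochs=2000, seed=0):
    """Train until the relative change in MSE falls below tol (assumed criterion)."""
    rng = random.Random(seed)
    n_in, n_out = len(inputs[0]), len(targets[0])
    # Initialize input-to-hidden and hidden-to-output weights with small random values.
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    history = [mean_squared_error(inputs, targets, w_ih, w_ho)]
    for _ in range(max_epochs):
        for x, t in zip(inputs, targets):
            hidden, out = forward(x, w_ih, w_ho)
            # Error terms for the output and hidden layers (squared-error loss).
            d_out = [(ok - tk) * ok * (1.0 - ok) for ok, tk in zip(out, t)]
            d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w_ho[k][j] for k in range(n_out))
                     for j, hj in enumerate(hidden)]
            for k in range(n_out):
                for j, v in enumerate(hidden + [1.0]):
                    w_ho[k][j] -= lr * d_out[k] * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_ih[j][i] -= lr * d_hid[j] * v
        history.append(mean_squared_error(inputs, targets, w_ih, w_ho))
        prev, cur = history[-2], history[-1]
        if prev > 0 and abs(prev - cur) / prev < tol:
            break  # error has stopped changing within the permissible limit
    return w_ih, w_ho, history
```

The returned weights correspond to the information that step (b) stores for later reading in step (c); the history list is kept only so the convergence behaviour can be inspected.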

2. (Amended) The method as claimed in claim 1, wherein 
the voice framing of said step (a) is performed by taking a 
Hamming window in a length of 16 msec every 10 msec, with respect 
to the overall [incoming] length of the end portion of the vocal 
samples. 
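
For illustration only, the framing of claim 2 and the inter-frame processing of step (a) might be sketched as follows. This is a sketch under stated assumptions: the sampling rate is not given in the claims (16 kHz is used in the usage note purely as an example), the per-frame characteristic vector is left unspecified and is taken here as an input, and normalizing against one global minimum and maximum is an assumed reading of "normalize the maximum and minimum". All function names are hypothetical.

```python
import math

def hamming(n):
    """Hamming window of n points: 0.54 - 0.46 * cos(2*pi*i / (n - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(samples, sample_rate, win_ms=16, hop_ms=10):
    """Cut the samples into Hamming-windowed frames of win_ms taken every hop_ms."""
    win = int(sample_rate * win_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    w = hamming(win)
    frames = []
    for start in range(0, len(samples) - win + 1, hop):
        frames.append([s * c for s, c in zip(samples[start:start + win], w)])
    return frames

def inter_frame_diff(features):
    """Difference between each pair of adjacent per-frame feature vectors."""
    return [[b - a for a, b in zip(f0, f1)] for f0, f1 in zip(features, features[1:])]

def min_max_normalize(vectors):
    """Scale all values into [0, 1] using the global minimum and maximum."""
    flat = [v for vec in vectors for v in vec]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [[(v - lo) / span for v in vec] for vec in vectors]
```

At an assumed 16 kHz sampling rate, a 16 msec window is 256 samples and a 10 msec step is 160 samples, so consecutive frames overlap by 96 samples.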



Please cancel Claim 3 without prejudice or disclaimer. 




Please insert the following new Claim 4: 



-- 4. The method as claimed in claim 1, wherein the phoneme 
border discrimination of said step (c) is performed such that the 
difference of the output values OUT(0) and OUT(1) is compared 
to the threshold value determined from previous experimental 
statistics, and it is determined that if the difference is larger 
than the threshold value and OUT(0) is larger than OUT(1), then 
the analyzed frame is the phoneme border, and if the difference 
is larger than the threshold value and OUT(1) is larger than 
OUT(0), the analyzed frame is not the phoneme border. -- 
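
For illustration only, the discrimination rule of claim 4 might be sketched as follows. This is a sketch under stated assumptions: the difference is taken as an absolute value, the threshold is passed in by the caller because the claim says only that it is determined from previous experimental statistics, and the case where the difference does not exceed the threshold, which the claim does not address, is reported as undetermined. The names is_phoneme_border and find_borders are hypothetical.

```python
def is_phoneme_border(out0, out1, threshold):
    """Apply the claim 4 rule to the two MLP output values of one frame.

    Returns True for a border, False for a non-border, and None when the
    difference does not exceed the threshold (case not covered by the claim).
    """
    diff = abs(out0 - out1)  # assumption: magnitude of the difference
    if diff > threshold and out0 > out1:
        return True
    if diff > threshold and out1 > out0:
        return False
    return None

def find_borders(outputs, threshold):
    """Return the frame numbers whose (OUT(0), OUT(1)) pair indicates a border."""
    return [i for i, (o0, o1) in enumerate(outputs)
            if is_phoneme_border(o0, o1, threshold) is True]
```

For example, with an assumed threshold of 0.5, a frame with outputs (0.9, 0.1) would be marked as a border and a frame with outputs (0.1, 0.9) would not.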



